# Human Pose Estimation

Vitpose Plus Huge
Apache-2.0
ViTPose++ is a vision Transformer-based foundation model for human pose estimation that achieves 81.1 AP on the MS COCO keypoint test set.
Pose Estimation Transformers
usyd-community
14.49k
6
Vitpose Plus Large
Apache-2.0
ViTPose++ is a vision Transformer-based foundation model for human pose estimation that achieves 81.1 AP on the MS COCO keypoint test set.
Pose Estimation Transformers
usyd-community
1,731
1
Vitpose Plus Small
Apache-2.0
ViTPose++ is a vision Transformer-based human pose estimation model that achieves 81.1 AP on the MS COCO keypoint detection benchmark.
Pose Estimation Transformers
usyd-community
30.02k
2
Vitpose Plus Base
Apache-2.0
ViTPose is a vision Transformer-based human pose estimation model that achieves 81.1 AP on the MS COCO keypoint detection benchmark with a simple design.
Pose Estimation Transformers English
usyd-community
22.26k
10
Vitpose Base Coco Aic Mpii
Apache-2.0
ViTPose is a human pose estimation model based on the Vision Transformer that performs strongly on benchmarks such as MS COCO with a simple architectural design.
Pose Estimation Transformers English
usyd-community
38
1
Vitpose Base
Apache-2.0
A vision Transformer-based human pose estimation model that achieves 81.1 AP on the MS COCO keypoint test set.
Pose Estimation Transformers English
usyd-community
761
9
Vitpose Base Simple
Apache-2.0
ViTPose is a human pose estimation model based on the Vision Transformer that achieves 81.1 AP on the MS COCO keypoint test set, with advantages in model simplicity, scalable size, and flexible training.
Pose Estimation Transformers English
usyd-community
51.40k
20
Vitpose Base Simple
Apache-2.0
ViTPose is a baseline model for human pose estimation based on plain vision Transformers that delivers strong keypoint detection with a simple architecture.
Pose Estimation Transformers English
danelcsb
20
1
Vitpose
This model detects keypoints in images or videos and is suited to tasks such as human pose estimation and facial landmark detection.
Pose Estimation Transformers
shauray
19
0